Machine Learning Pipelines

Machine Learning (ML) is centered around constructing models capable of automating diverse tasks. Such tasks can vary from detecting fraudulent transactions and identifying parking lots in satellite imagery, to translating text between languages or offering pricing suggestions for cargo transportation.

Although certain tasks may be performed better by humans, machines excel at executing them rapidly, continuously and without getting bored.

As a programmer, you can think of a ML model as a function that picks up arguments and returns with an answer, similar to traditional programming. The distinction lies in the fact that instead of being coded by a human it is rather a black box derived from extensive data. The model includes a significant amount of data encapsulated by code to interpret that information into a function with multiple variables. 

This analogy draws a parallel between machine learning pipelines and continuous integration / delivery pipelines in software development. Both types of pipelines compile source code into executable artifacts.

There are exceptions to this though, such as: 

●      In software we mostly work with codebases, whereas in machine learning models and big amounts of data are being managed.

●      Software can be tested well - the build either passes or fails. Machine learning models on the other hand are always inaccurate to some degree.

●      Data can be wrong and can become outdated, as can models.

Let me give you an example of how models can become outdated. In 2020, all the models that suggested prices for delivering cargo between two locations started giving wrong answers. Why did this happen? It was because of Brexit. Businesses in the UK started stocking up on supplies before the borders were shut, which created a higher demand, leading to higher prices. Since the models were trained on past data, they didn't know how to handle this new situation.

As software engineers, we care about making sure our code works perfectly in every possible situation. On the other hand, data scientists must be okay with not knowing everything and dealing with differences in data. ML is about finding patterns in data, which can be challenging because it's not always clear what the data means. This makes machine learning more flexible for complex problems, but also harder to understand and fix if something goes wrong. 

Most problems that software engineers solve today will continue to be solved using traditional programming in the future. However, ML can be used to solve new types of problems that couldn't be solved before and can be a great addition to software development. As long as sufficient amounts of data are available and it has been tagged with a desired outcome.

What are machine learning pipelines?

Machine learning is a way of teaching computers to do things that are normally done by humans. It's like a set of instructions that the computer can use to learn and make decisions based on data. Machine learning pipelines are a series of steps that turn data into a trained and tested machine learning model. The pipeline involves getting data, changing the data to be good for training, creating a model, and packaging it so it can be used for a long time. The model can then be made available to others through an API.

A machine learning pipeline is like a big machine that takes in data and gives out predictions. It's made up of different parts, like the data that comes in, the way the data is changed so the machine can understand it, the machine learning model that does the thinking, and the output that the machine gives. All these parts work together to make sure the machine can make good predictions.

This is how machine learning pipelines could look like in a simple form:

  1. Download data from some source. Usually, it will be a set of datastore rows or records.

  2. Convert data to a format suitable for training: select features (arguments for the model), remove noise and bad records. Some fields in the dataset will be used as input and others will be specified as the desired output that we want to predict.

  3. Define the model format (smell the wind and say “this big equation with a lot of variables will get the job done”) and train the model on data (tweak formula variables in semi-random way until the model starts accurately guessing results given inputs).

  4. Package the model into a durable format (e.g., a Docker container with some binary blob).

  5. Optionally, deploy the model as a service with an API.

 

Normally these transformations are codified as workflows (workflow as code), versioned and deployed. In simpler projects, one can implement them with Bash or Python scripts. Larger projects and teams are well advised to use something that is better documented and based on conventions (for example via domain-specific language, or declaratives syntax).

Long story short: Machine learning pipelines are codified workflows that ingest data, transform, and derive reusable models from it. Their goal is similar to CI/CD pipelines in software engineering: automate, ensure repeatability and scale processes. Implementation details differ from CI/CD because machine learning pipelines work mostly with data.

Why are machine learning pipelines important?

In a regular system design, all the tasks would be performed together in a single program. This means that the same code would be used to collect, clean, model, and deploy the data. Because machine learning models generally have less code than other software programs, it makes sense to keep everything in one place.

In the ML pipeline, every step of your work process is made into its own separate service. This means that when you want to create a new workflow, you can select the specific parts you need and use them wherever you want. Any updates or changes made to a service will be done at a higher level, making it easier and more efficient to manage.

Machine learning pipelining can solve several problems. It allows for more efficient scaling of ML workflows. Rather than having to repeat the entire process for each new model, pipelining enables the reuse of the same data preparation and processing steps. Also, by allowing you to update individual components without affecting the rest of the pipeline, ML pipelining can help with version control.

Considering workflow efficiency: Breaking down a machine learning workflow into smaller, reusable components can save a lot of time. And last but not least, with machine learning pipelining, teams can collaborate on individual parts of a workflow without worrying about how their changes will affect the entire process.

Contact

Christoph Hasenzagl
TIMETOACT GROUP Österreich GmbHContact
Felix KrauseBlog
Blog

Creating a Cross-Domain Capable ML Pipeline

As classifying images into categories is a ubiquitous task occurring in various domains, a need for a machine learning pipeline which can accommodate for new categories is easy to justify. In particular, common general requirements are to filter out low-quality (blurred, low contrast etc.) images, and to speed up the learning of new categories if image quality is sufficient. In this blog post we compare several image classification models from the transfer learning perspective.

Rinat AbdullinRinat AbdullinBlog
Blog

State of Fast Feedback in Data Science Projects

DSML projects can be quite different from the software projects: a lot of R&D in a rapidly evolving landscape, working with data, distributions and probabilities instead of code. However, there is one thing in common: iterative development process matters a lot.

Felix KrauseBlog
Blog

Part 2: Detecting Truck Parking Lots on Satellite Images

In the previous blog post, we created an already pretty powerful image segmentation model in order to detect the shape of truck parking lots on satellite images. However, we will now try to run the code on new hardware and get even better as well as more robust results.

Felix KrauseBlog
Blog

Part 1: Detecting Truck Parking Lots on Satellite Images

Real-time truck tracking is crucial in logistics: to enable accurate planning and provide reliable estimation of delivery times, operators build detailed profiles of loading stations, providing expected durations of truck loading and unloading, as well as resting times. Yet, how to derive an exact truck status based on mere GPS signals?

Felix KrauseBlog
Blog

Boosting speed of scikit-learn regression algorithms

The purpose of this blog post is to investigate the performance and prediction speed behavior of popular regression algorithms, i.e. models that predict numerical values based on a set of input variables.

Rinat AbdullinRinat AbdullinBlog
Blog

Part 1: TIMETOACT Logistics Hackathon - Behind the Scenes

A look behind the scenes of our Hackathon on Sustainable Logistic Simulation in May 2022. This was a hybrid event, running on-site in Vienna and remotely. Participants from 12 countries developed smart agents to control cargo delivery truck fleets in a simulated Europe.

Rinat AbdullinRinat AbdullinBlog
Blog

Let's build an Enterprise AI Assistant

In the previous blog post we have talked about basic principles of building AI assistants. Let’s take them for a spin with a product case that we’ve worked on: using AI to support enterprise sales pipelines.

Referenz
Referenz

Automated Planning of Transport Routes

Efficient transport route planning through automation and seamless integration.

Rinat AbdullinRinat AbdullinBlog
Blog

So You are Building an AI Assistant?

So you are building an AI assistant for the business? This is a popular topic in the companies these days. Everybody seems to be doing that. While running AI Research in the last months, I have discovered that many companies in the USA and Europe are building some sort of AI assistant these days, mostly around enterprise workflow automation and knowledge bases. There are common patterns in how such projects work most of the time. So let me tell you a story...

Daniel PuchnerBlog
Blog

Make Your Value Stream Visible Through Structured Logging

Boost your value stream visibility with structured logging. Improve traceability and streamline processes in your software development lifecycle.

Blog
Blog

Second Place - AIM Hackathon 2024: Trustpilot for ESG

The NightWalkers designed a scalable tool that assigns trustworthiness scores based on various types of greenwashing indicators, including unsupported claims and inaccurate data.

Nina DemuthBlog
Blog

From the idea to the product: The genesis of Skwill

We strongly believe in the benefits of continuous learning at work; this has led us to developing products that we also enjoy using ourselves. Meet Skwill.

Daniel PuchnerBlog
Blog

How we discover and organise domains in an existing product

Software companies and consultants like to flex their Domain Driven Design (DDD) muscles by throwing around terms like Domain, Subdomain and Bounded Context. But what lies behind these buzzwords, and how these apply to customers' diverse environments and needs, are often not as clear. As it turns out it takes a collaborative effort between stakeholders and development team(s) over a longer period of time on a regular basis to get them right.

Rinat AbdullinRinat AbdullinBlog
Blog

Learning + Sharing at TIMETOACT GROUP Austria

Discover how we fosters continuous learning and sharing among employees, encouraging growth and collaboration through dedicated time for skill development.

Blog
Blog

ChatGPT & Co: LLM Benchmarks for September

Find out which large language models outperformed in the September 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Martin WarnungMartin WarnungBlog
Blog

Common Mistakes in the Development of AI Assistants

How fortunate that people make mistakes: because we can learn from them and improve. We have closely observed how companies around the world have implemented AI assistants in recent months and have, unfortunately, often seen them fail. We would like to share with you how these failures occurred and what can be learned from them for future projects: So that AI assistants can be implemented more successfully in the future!

Jörg EgretzbergerJörg EgretzbergerBlog
Blog

8 tips for developing AI assistants

AI assistants for businesses are hype, and many teams were already eagerly and enthusiastically working on their implementation. Unfortunately, however, we have seen that many teams we have observed in Europe and the US have failed at the task. Read about our 8 most valuable tips, so that you will succeed.

Rinat AbdullinRinat AbdullinBlog
Blog

Using NLP libraries for post-processing

Learn how to analyse sticky notes in miro from event stormings and how this analysis can be carried out with the help of the spaCy library.

Bernhard SchauerBlog
Blog

ADRs as a Tool to Build Empowered Teams

Learn how we use Architecture Decision Records (ADRs) to build empowered, autonomous teams, enhancing decision-making and collaboration.

Peter SzarvasPeter SzarvasBlog
Blog

Why Was Our Project Successful: Coincidence or Blueprint?

“The project exceeded all expectations,” is one among our favourite samples of the very positive feedback from our client. Here's how we did it!

Christian FolieBlog
Blog

The Power of Event Sourcing

This is how we used Event Sourcing to maintain flexibility, handle changes, and ensure efficient error resolution in application development.

Jonathan ChannonBlog
Blog

Tracing IO in .NET Core

Learn how we leverage OpenTelemetry for efficient tracing of IO operations in .NET Core applications, enhancing performance and monitoring.

Rinat AbdullinRinat AbdullinBlog
Blog

Consistency and Aggregates in Event Sourcing

Learn how we ensures data consistency in event sourcing with effective use of aggregates, enhancing system reliability and performance.

Blog
Blog

My Workflows During the Quarantine

The current situation has deeply affected our daily lives. However, in retrospect, it had a surprisingly small impact on how we get work done at TIMETOACT GROUP Austria.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 7

Explore LINQ and query expressions in F#. Simplify data manipulation and enhance your functional programming skills with this guide.

Laura GaetanoBlog
Blog

My Weekly Shutdown Routine

Discover my weekly shutdown routine to enhance productivity and start each week fresh. Learn effective techniques for reflection and organization.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 8

Discover Units of Measure and Type Providers in F#. Enhance data management and type safety in your applications with these powerful tools.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 9

Explore Active Patterns and Computation Expressions in F#. Enhance code clarity and functionality with these advanced techniques.

Rinat AbdullinRinat AbdullinBlog
Blog

Innovation Incubator at TIMETOACT GROUP Austria

Discover how our Innovation Incubator empowers teams to innovate with collaborative, week-long experiments, driving company-wide creativity and progress.

Laura GaetanoBlog
Blog

Using a Skill/Will matrix for personal career development

Discover how a Skill/Will Matrix helps employees identify strengths and areas for growth, boosting personal and professional development.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 6

Learn error handling in F# with option types. Improve code reliability using F#'s powerful error-handling techniques.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 10

Discover Agents and Mailboxes in F#. Build responsive applications using these powerful concurrency tools in functional programming.

Rinat AbdullinRinat AbdullinBlog
Blog

Process Pipelines

Discover how process pipelines break down complex tasks into manageable steps, optimizing workflows and improving efficiency using Kanban boards.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 11

Learn type inference and generic functions in F#. Boost efficiency and flexibility in your code with these essential programming concepts.

Rinat AbdullinRinat AbdullinBlog
Blog

Innovation Incubator Round 1

Team experiments with new technologies and collaborative problem-solving: This was our first round of the Innovation Incubator.

Rinat AbdullinRinat AbdullinBlog
Blog

Announcing Domain-Driven Design Exercises

Interested in Domain Driven Design? Then this DDD exercise is perfect for you!

Laura GaetanoBlog
Blog

5 lessons from running a (remote) design systems book club

Last year I gifted a design systems book I had been reading to a friend and she suggested starting a mini book club so that she’d have some accountability to finish reading the book. I took her up on the offer and so in late spring, our design systems book club was born. But how can you make the meetings fun and engaging even though you're physically separated? Here are a couple of things I learned from running my very first remote book club with my friend!

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 2

Explore functions, types, and modules in F#. Enhance your skills with practical examples and insights in this detailed guide.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 4

Unlock F# collections and pipelines. Manage data efficiently and streamline your functional programming workflow with these powerful tools.

Rinat AbdullinRinat AbdullinBlog
Blog

5 Inconvenient Questions when hiring an AI company

This article discusses five questions you should ask when buying an AI. These questions are inconvenient for providers of AI products, but they are necessary to ensure that you are getting the best product for your needs. The article also discusses the importance of testing the AI system on your own data to see how it performs.

Rinat AbdullinRinat AbdullinBlog
Blog

LLM Performance Series: Batching

Beginning with the September Trustbit LLM Benchmarks, we are now giving particular focus to a range of enterprise workloads. These encompass the kinds of tasks associated with Large Language Models that are frequently encountered in the context of large-scale business digitalization.

Daniel WellerBlog
Blog

Revolutionizing the Logistics Industry

As the logistics industry becomes increasingly complex, businesses need innovative solutions to manage the challenges of supply chain management, trucking, and delivery. With competitors investing in cutting-edge research and development, it is vital for companies to stay ahead of the curve and embrace the latest technologies to remain competitive. That is why we introduce the TIMETOACT Logistics Simulator Framework, a revolutionary tool for creating a digital twin of your logistics operation.

Rinat AbdullinRinat AbdullinBlog
Blog

Event Sourcing with Apache Kafka

For a long time, there was a consensus that Kafka and Event Sourcing are not compatible with each other. So it might look like there is no way of working with Event Sourcing. But there is if certain requirements are met.

Rinat AbdullinRinat AbdullinBlog
Blog

Strategic Impact of Large Language Models

This blog discusses the rapid advancements in large language models, particularly highlighting the impact of OpenAI's GPT models.

Chrystal LantnikBlog
Blog

CSS :has() & Responsive Design

In my journey to tackle a responsive layout problem, I stumbled upon the remarkable benefits of the :has() pseudo-class. Initially, I attempted various other methods to resolve the issue, but ultimately, embracing the power of :has() proved to be the optimal solution. This blog explores my experience and highlights the advantages of utilizing the :has() pseudo-class in achieving flexible layouts.

Ian RussellIan RussellBlog
Blog

So, I wrote a book

Join me as I share the story of writing a book on F#. Discover the challenges, insights, and triumphs along the way.

Nina DemuthBlog
Blog

7 Positive effects of visualizing the interests of your team

Interests maps unleash hidden potentials and interests, but they also make it clear which topics are not of interest to your colleagues.

Daniel PuchnerBlog
Blog

How to gather data from Miro

Learn how to gather data from Miro boards with this step-by-step guide. Streamline your data collection for deeper insights.

Christian FolieBlog
Blog

Designing and Running a Workshop series: An outline

Learn how to design and execute impactful workshops. Discover tips, strategies, and a step-by-step outline for a successful workshop series.

Christian FolieBlog
Blog

Designing and Running a Workshop series: The board

In this part, we discuss the basic design of the Miro board, which will aid in conducting the workshops.

Sebastian BelczykBlog
Blog

Composite UI with Design System and Micro Frontends

Discover how to create scalable composite UIs using design systems and micro-frontends. Enhance consistency and agility in your development process.

Rinat AbdullinRinat AbdullinBlog
Blog

The Intersection of AI and Voice Manipulation

The advent of Artificial Intelligence (AI) in text-to-speech (TTS) technologies has revolutionized the way we interact with written content. Natural Readers, standing at the forefront of this innovation, offers a comprehensive suite of features designed to cater to a broad spectrum of needs, from personal leisure to educational support and commercial use. As we delve into the capabilities of Natural Readers, it's crucial to explore both the advantages it brings to the table and the ethical considerations surrounding voice manipulation in TTS technologies.

Aqeel AlazreeBlog
Blog

Database Analysis Report

This report comprehensively analyzes the auto parts sales database. The primary focus is understanding sales trends, identifying high-performing products, Analyzing the most profitable products for the upcoming quarter, and evaluating inventory management efficiency.

Aqeel AlazreeBlog
Blog

Part 4: Save Time and Analyze the Database File

ChatGPT-4 enables you to analyze database contents with just two simple steps (copy and paste), facilitating well-informed decision-making.

Aqeel AlazreeBlog
Blog

Part 3: How to Analyze a Database File with GPT-3.5

In this blog, we'll explore the proper usage of data analysis with ChatGPT and how you can analyze and visualize data from a SQLite database to help you make the most of your data.

Aqeel AlazreeBlog
Blog

Part 2: Data Analysis with powerful Python

Analyzing and visualizing data from a SQLite database in Python can be a powerful way to gain insights and present your findings. In Part 2 of this blog series, we will walk you through the steps to retrieve data from a SQLite database file named gold.db and display it in the form of a chart using Python. We'll use some essential tools and libraries for this task.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F#

Dive into functional programming with F# in our introductory series. Learn how to solve real business problems using F#'s functional programming features. This first part covers setting up your environment, basic F# syntax, and implementing a simple use case. Perfect for developers looking to enhance their skills in functional programming.

Aqeel AlazreeBlog
Blog

Part 1: Data Analysis with ChatGPT

In this new blog series we will give you an overview of how to analyze and visualize data, create code manually and how to make ChatGPT work effectively. Part 1 deals with the following: In the data-driven era, businesses and organizations are constantly seeking ways to extract meaningful insights from their data. One powerful tool that can facilitate this process is ChatGPT, a state-of-the-art natural language processing model developed by OpenAI. In Part 1 pf this blog, we'll explore the proper usage of data analysis with ChatGPT and how it can help you make the most of your data.

Felix KrauseBlog
Blog

License Plate Detection for Precise Car Distance Estimation

When it comes to advanced driver-assistance systems or self-driving cars, one needs to find a way of estimating the distance to other vehicles on the road.

Matus ZilinskyBlog
Blog

Creating a Social Media Posts Generator Website with ChatGPT

Using the GPT-3-turbo and DALL-E models in Node.js to create a social post generator for a fictional product can be really helpful. The author uses ChatGPT to create an API that utilizes the openai library for Node.js., a Vue component with an input for the title and message of the post. This article provides step-by-step instructions for setting up the project and includes links to the code repository.

Christian FolieBlog
Blog

Running Hybrid Workshops

When modernizing or building systems, one major challenge is finding out what to build. In Pre-Covid times on-site workshops were a main source to get an idea about ‘the right thing’. But during Covid everybody got used to working remotely, so now the question can be raised: Is it still worth having on-site, physical workshops?

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 3

Dive into F# data structures and pattern matching. Simplify code and enhance functionality with these powerful features.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 12

Explore reflection and meta-programming in F#. Learn how to dynamically manipulate code and enhance flexibility with advanced techniques.

Rinat AbdullinRinat AbdullinBlog
Blog

Celebrating achievements

Our active memory can be like a cache of recently used data; fresh ideas & frustrations supersede older ones. That's why celebrating achievements is key for your success.

Ian RussellIan RussellBlog
Blog

Introduction to Web Programming in F# with Giraffe – Part 1

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Ian RussellIan RussellBlog
Blog

Introduction to Web Programming in F# with Giraffe – Part 2

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Ian RussellIan RussellBlog
Blog

Introduction to Partial Function Application in F#

Partial Function Application is one of the core functional programming concepts that everyone should understand as it is widely used in most F# codebases.In this post I will introduce you to the grace and power of partial application. We will start with tupled arguments that most devs will recognise and then move onto curried arguments that allow us to use partial application.

Rinat AbdullinRinat AbdullinBlog
Blog

Inbox helps to clear the mind

I hate distractions. They can easily ruin my day when I'm in the middle of working on a cool project. They do that by overloading my mind, buzzing around inside me, and just making me tired. Even though we can think about several things at once, we can only do one thing at a time.

Sebastian BelczykBlog
Blog

Building and Publishing Design Systems | Part 2

Learn how to build and publish design systems effectively. Discover best practices for creating reusable components and enhancing UI consistency.

Sebastian BelczykBlog
Blog

Building a micro frontend consuming a design system | Part 3

In this blopgpost, you will learn how to create a react application that consumes a design system.

Sebastian BelczykBlog
Blog

Building A Shell Application for Micro Frontends | Part 4

We already have a design system, several micro frontends consuming this design system, and now we need a shell application that imports micro frontends and displays them.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 5

Master F# asynchronous workflows and parallelism. Enhance application performance with advanced functional programming techniques.

Ian RussellIan RussellBlog
Blog

Introduction to Web Programming in F# with Giraffe – Part 3

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Balazs MolnarBalazs MolnarBlog
Blog

Learn & Share video Obsidian

Knowledge is very powerful. So, finding the right tool to help you gather, structure and access information anywhere and anytime, is rather a necessity than an option. You want to accomplish your tasks better? You want a reliable tool which is easy to use, extendable and adaptable to your personal needs? Today I would like to introduce you to the knowledge management system of my choice: Obsidian.

Nina DemuthBlog
Blog

They promised it would be the next big thing!

Haven’t we all been there? We have all been promised by teachers, colleagues or public speakers that this or that was about to be the next big thing in tech that would change the world as we know it.

Jonathan ChannonBlog
Blog

Understanding F# Type Aliases

In this post, we discuss the difference between F# types and aliases that from a glance may appear to be the same thing.

Rinat AbdullinRinat AbdullinBlog
Blog

Open-sourcing 4 solutions from the Enterprise RAG Challenge

Our RAG competition is a friendly challenge different AI Assistants competed in answering questions based on the annual reports of public companies.

Bernhard SchauerBlog
Blog

Isolating legacy code with ArchUnit tests

Clear boundaries in code are important ... and hard. ArchUnit allows you to capture the structure your team agreed on in tests.

Jonathan ChannonBlog
Blog

Understanding F# applicatives and custom operators

In this post, Jonathan Channon, a newcomer to F#, discusses how he learnt about a slightly more advanced functional concept — Applicatives.

Ian RussellIan RussellBlog
Blog

Creating solutions and projects in VS code

In this post we are going to create a new Solution containing an F# console project and a test project using the dotnet CLI in Visual Studio Code.

Felix KrauseBlog
Blog

AIM Hackathon 2024: Sustainability Meets LLMs

Focusing on impactful AI applications, participants addressed key issues like greenwashing detection, ESG report relevance mapping, and compliance with the European Green Deal.

Blog
Blog

SAM Wins First Prize at AIM Hackathon

The winning team of the AIM Hackathon, nexus. Group AI, developed SAM, an AI-powered ESG reporting platform designed to help companies streamline their sustainability compliance.

Blog
Blog

Third Place - AIM Hackathon 2024: The Venturers

ESG reports are often filled with vague statements, obscuring key facts investors need. This team created an AI prototype that analyzes these reports sentence-by-sentence, categorizing content to produce a "relevance map".

Ian RussellIan RussellBlog
Blog

Ways of Creating Single Case Discriminated Unions in F#

There are quite a few ways of creating single case discriminated unions in F# and this makes them popular for wrapping primitives. In this post, I will go through a number of the approaches that I have seen.

Ian RussellIan RussellBlog
Blog

Using Discriminated Union Labelled Fields

A few weeks ago, I re-discovered labelled fields in discriminated unions. Despite the fact that they look like tuples, they are not.

Blog
Blog

ChatGPT & Co: LLM Benchmarks for October

Find out which large language models outperformed in the October 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

TIMETOACT
Service
Navigationsbild zu Data Science
Service

Data Science, Artificial Intelligence and Machine Learning

For some time, Data Science has been considered the supreme discipline in the recognition of valuable information in large amounts of data. It promises to extract hidden, valuable information from data of any structure.

TIMETOACT
Technologie
Headerbild IBM Cloud Pak for Data
Technologie

IBM Cloud Pak for Data

The Cloud Pak for Data acts as a central, modular platform for analytical use cases. It integrates functions for the physical and virtual integration of data into a central data pool - a data lake or a data warehouse, a comprehensive data catalogue and numerous possibilities for (AI) analysis up to the operational use of the same.

TIMETOACT
Technologie
Headerbild zu IBM Cognos Analytics 11
Technologie

IBM Cognos Analytics 11

IBM Cognos Analytics is a central platform for the provision of dispositive information in the company. With the reporting and analysis functions of IBM Cognos, the relevant information can be prepared and used throughout the company.

TIMETOACT
Technologie
Headerbild zu IBM DataStage
Technologie

IBM InfoSphere Information Server

IBM Information Server is a central platform for enterprise-wide information integration. With IBM Information Server, business information can be extracted, consolidated and merged from a wide variety of sources.

TIMETOACT
Technologie
Headerbild zu IBM DB2
Technologie

IBM Db2

The IBM Db2database has been established on the market for many years as the leading data warehouse database in addition to its classic use in operations.

TIMETOACT
Technologie
Headerbild zu IBM Netezza Performance Server
Technologie

IBM Netezza Performance Server

IBM offers Database technology for specific purposes in the form of appliance solutions. In the Data Warehouse environment, the Netezza technology, later marketed under the name "IBM PureData for Analytics", is particularly well known.

TIMETOACT
Technologie
Headerbild zu IBM Planning Analytics mit Watson
Technologie

IBM Planning Analytics mit Watson

IBM Planning Analytics with Watsons enables the automation of planning, budgeting, forecasting and analysis processes using IBM TM1.

TIMETOACT
Technologie
Headerbild für IBM SPSS
Technologie

IBM SPSS Modeler

IBM SPSS Modeler is a tool that can be used to model and execute tasks, for example in the field of Data Science and Data Mining, via a graphical user interface.

TIMETOACT
Technologie
Headerbild zu IBM Watson Studio
Technologie

IBM Watson Studio

IBM Watson Studio is an integrated solution for implementing a data science landscape. It helps companies to structure and simplify the process from exploratory analysis to the implementation and operationalisation of the analysis processes.

TIMETOACT
Technologie
Headerbild zu IBM Watson® Knowledge Catalog
Technologie

IBM Watson® Knowledge Catalog/Information Governance Catalog

Today, "IGC" is a proprietary enterprise cataloging and metadata management solution that is the foundation of all an organization's efforts to comply with rules and regulations or document analytical assets.

TIMETOACT
Technologie
Headerbild zu IBM Decision Optimization
Technologie

Decision Optimization

Mathematical algorithms enable fast and efficient improvement of partially contradictory specifications. As an integral part of the IBM Data Science platform "Cloud Pak for Data" or "IBM Watson Studio", decision optimisation has been decisively expanded and embedded in the Data Science process.